Use of Commercial FPGA-Based Evaluation 
Boards for Single-Event Testing of 
DDR2 and DDR3 SDRAMs 
Abstract: We investigate the use of commercial FPGA-based 
evaluation boards for radiation testing of DDR2 and DDR3 SDRAMs. 
We evaluate the resulting data quality and the tradeoffs involved in 
the use of these boards. 

Index Terms: probabilistic risk assessment, radiation effects, 
reliability estimation, quality assurance, and radiation hardness 
assurance. 

I. INTRODUCTION 

Single-event testing of double-data-rate (DDR) 
Synchronous Dynamic Random Access Memories (SDRAMs) 
poses many logistical and technical challenges. Because DDR 
SDRAMs are commercial and in demand for commercial 
electronics, even obtaining single memory chips poses 
challenges. The chips are packaged in flip-chip ball grid 
arrays (FBGA), which preclude front-side irradiation and 
require thinning for the beam to reach the sensitive volume 
from the backside. The stringent timing demands of these 
devices complicate the task of board/tester layout, as the 
signal traces must be chosen appropriately so that all signals 
meet timing requirements. The high operation speeds, high 
density and high susceptibility to multiple error modes further 
complicate everything from tester design to data analysis. 
Moreover, all these challenges are expected to worsen for 
future generations. Each new DDR generation may require a 
new tester incorporating a state-of-the-art (SOTA) field 
programmable gate array (FPGA) to test these parts at speed. 
Use of commercial evaluation boards for Single-Event Effects 
(SEE) testing poses interesting trade-offs for some of these 
issues. Such boards typically interface to commercial memory 
modules, which are widely available and inexpensive. 
Evaluation board layout is optimized to ensure proper signal 
timing, and the controller, typically a commercial FPGA, is 
chosen to match the memory data rate. On the other 
hand, preparing devices on a memory module for access to test 
ions is more complicated than preparation of an individual 
chip. The intellectual property (IP) designed for the FPGA is 
often not optimized for SEE testing — a demanding application 
in which data are accessed on every clock cycle. Finally, the 
use of memory modules complicates the task of controlling 
power to the device under test (DUT). Thus, current limiting 
will be ineffective at circumventing single-event latchup 
(SEL), and if single-event functional interrupts (SEFI) require 
power cycling for recovery, the entire board must be power 
cycled, necessitating a time-consuming reload of configuration 
data to the FPGA and re-initialization of the tester and DUT. 
In this manuscript we discuss the use of commercial FPGA-based 
evaluation boards to test DDR2/3 memories, paying particular 
attention to how the above-mentioned trade-offs were negotiated. 

II. Test Devices and Evaluation Boards 

We tested DDR3 M471B5773DH0-CH9 Dual In-line 
Memory Modules (DIMMs), each with eight 2-Gb FBGAs 
(K4B2G0846-HCH9 [1]). The tester was a Xilinx 
EK-V7-VC707-CES-G (Virtex-7-based) evaluation board [2] 
(see Fig. 1). The DDR2 test parts were 512 MB and 1 GB 
200-pin DDR2 DIMMs, M470T2863FB3-CE6 and 
M470T6464FBS-CE6, each with four or eight 1-Gb 
K4T1G084QF-BCE7 die [3]. We used a Xilinx 
HW-V5-ML506-UNI-G (Virtex-5-based) evaluation board as the 
tester for the DDR2 DIMMs [4]. We chose these DIMMs 
because they contained the DDR2 and DDR3 die of interest to 
us. Figs. 1 and 2 show examples of the commercial evaluation 
testers used in this work. 



Fig. 1: DDR3 evaluation board at the Texas A&M University test site, 
showing the thinned DDR3 Device under Test (DUT) and the Virtex-7 FPGA 
(under the fan) which controls the DDR3. 
Fig. 2: The Xilinx Virtex-5 tester has parts mounted on both sides of the 
board. Most of the parts are mounted on the (a)-side with the Virtex-5. The 
DIMMs are mounted on the (b)-side with few parts to obstruct ion beams 
incident at oblique angles. 


One FBGA on each DIMM was thinned to between 120 and 
200 µm and the DIMM was mounted on the evaluation board. 
The tester (evaluation board + DIMM) was controlled by 
means of a computer via a National Instruments LabVIEW 
software interface, which controlled the tester power. As 
mentioned above, power to the DIMM is supplied via the 
evaluation board, so there is no independent control of DUT 
power in the event of an overcurrent due to SEL or loss of 
control due to a SEFI. Thus, when a power cycle is required, 
the tester design must be reloaded into the configuration 
memory, and the DUT must be reprogrammed. 

Thinning of parts was carried out using an Ultra-Tec 
precision milling machine. This operation posed significant 
challenges due to the fragility of the FBGAs mounted on the 
DIMMs. The yield was less than 50% for the DDR2 devices. 
The three parts that survived thinning had thicknesses of ~200, 
~190 and ~140 µm. The yield for DDR3 devices was slightly 
improved (~50%), but again, the thicknesses feasible were 
limited, especially if polishing of the die was needed, as for 
two-photon absorption (TPA) laser SEE testing. Thicknesses 
of DDR3 devices varied from 120 µm (no polishing) to 
~200 µm (polished). These thicknesses were adequate for the 
25 MeV/amu beam tune at the Texas A&M University 
Cyclotron Facility (TAMU), as well as for the two-photon 
laser system at the Naval Research Laboratory (NRL). 
Thickness varied by less than 10 µm over the die surface. 

III. From Evaluation Board to Tester 

Although the SDRAM evaluation boards are designed to 
operate DDR2/3s, they are not optimized for SEE testing. 
Probably the most significant drawback of using the evaluation 
boards is the lack of ability to control power to the DUT. This 
would be a serious drawback for potentially SEL-susceptible 
parts. However, recent test results [5, 6, 7] have not shown 
SEL susceptibility in DDR2 and DDR3 SDRAMs. Moreover, 
while recovering an SDRAM that has suffered a SEFI by power 
cycling means cycling power to the entire evaluation board and 
then reloading the test program, this can be accomplished in a 
matter of minutes. Finally, we decided that SEL susceptibility 
would disqualify a part from consideration, so such a part 
would not require a full characterization. 

Significant modification to the evaluation board IP was also 
required. This posed significant challenges, as the need for the 
controller to operate independently (without a processor) and 
to sample data on every clock cycle required substantial 
redesign, and the timing of the IP proved to be 
fragile. Even the language of the IP could be an issue. The 
DDR2 board was designed using VHSIC Hardware 
Description Language (VHDL, where VHSIC = very-high-speed 
integrated circuits), while the DDR3 board was designed 
with Verilog. Often, FPGA designers use only one language 
or the other. 

The yield issues for FBGA thinning discussed above posed 
a final challenge. Here, we were helped by the availability of 
the 25 MeV/amu tune at TAMU, since the greater range of 
these ions allowed less aggressive thinning of the parts. This, 
and a thinning/polishing strategy that used lower pressure and 
higher bit speed, allowed us to take multiple parts to each test. 

Despite these challenges, the evaluation boards were a 
much more economical and rapid solution than developing a 
new dedicated FPGA-based tester. Moreover, as attested by 
the volume of data gathered (>230 runs for both DDR2 and 
DDR3 tests in 24 hours), the evaluation boards proved to be 
reliable test platforms. 
IV. DDR2 Testing 

We tested the DDR2s at TAMU using the 25-MeV/amu 
tune. Table I shows the ions used during testing, including 
their energies, ranges in Si and Linear Energy Transfer (LET) 
as they exit the beam pipe. The TAMU beam monitoring 
software estimates the LET at the sensitive volume, taking into 
account materials transited by the ions. All five ions were 
used for the DDR2 test. For each run, the thinned SDRAM 
was centered in the beam at the desired angle (tilt and roll), 
and the desired pattern was programmed into the memory and 
verified. Then the part was irradiated to the desired fluence or 
until it experienced an SEL, SEFI or other disruptive error. The 
errors were tallied during the irradiation for a dynamic test, 
and the errors were read at the end of the irradiation for a static 
test. Most of the testing was done dynamically, with a Counter 
pattern, where the memory contents were determined by the 
address in which they were stored. At the end of the test, the 
part functionality was verified, and run parameters and errors 
were recorded. Then the part was prepared for the next run. 
Testing was conducted at multiple angles of incidence when 
this was feasible (note: the thickness of some parts made 
testing beyond 45° to the normal impossible with Xe and Kr 
ions due to penetration range issues), and tilt and roll effects 
were compared for some ion/angle combinations. Most runs 
ended with a SEFI or a large block error that overflowed the 
First-In-First-Out (FIFO) memory. If the part recovered on its 
own, the error was called a block error. Otherwise, it was 
tallied as a SEFI, and in almost all cases, a hard reset 
(resynching of clock and reinitialization of the part) was 
required for recovery. In several cases, errors were found to 
persist in the DUT after the part was reset and reprogrammed. 
Most of these errors were due to stuck bits. However, in some 
cases, the continuing errors were due to a persistent SEFI that 
could only be cleared by cycling power to the affected device 
(and therefore to the entire tester). The SEE cross sections are 
plotted vs. effective LET, whether they conform to 
conventional effective LET dependence or not. 
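
The Counter pattern and the dynamic comparison described above can be sketched as follows. This is a minimal illustration, assuming the stored word is simply the address truncated to the data-word width. The source states only that the memory contents were determined by the address, so the exact mapping, the 64-bit width, and the names `counter_word` and `compare_readback` are assumptions made here for illustration.

```python
def counter_word(address: int, width_bits: int = 64) -> int:
    """Expected word for an address (assumed mapping: the address itself,
    truncated to the word width)."""
    return address & ((1 << width_bits) - 1)


def compare_readback(readback: dict, width_bits: int = 64) -> list:
    """Return (address, xor_mask) for every word that differs from the
    counter pattern; set bits in xor_mask mark the flipped bit positions."""
    errors = []
    for address, observed in sorted(readback.items()):
        mask = observed ^ counter_word(address, width_bits)
        if mask:
            errors.append((address, mask))
    return errors


# Hypothetical readback of eight words with one flipped bit at address 5.
readback = {addr: counter_word(addr) for addr in range(8)}
readback[5] ^= 0b100
errors = compare_readback(readback)
```

Because the expected word can be regenerated from the address alone, a comparison of this kind needs no stored reference copy, which is convenient when data are checked on every clock cycle.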


Table I: Ion Beams Used for Testing at TAMU 

Ion-Mass   Energy (MeV)   Range (µm)   Incident LET (MeV·cm²/mg) 
N-14            347          1009              0.9 
Ne-22           545           799              1.8 
Ar-40           991           493              5.5 
Kr-84          2081           332             19.8 
Xe-129         3197           286             38.9 
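
The range limitation on oblique-incidence testing noted in this section can be illustrated with the Table I values. The sketch below assumes the conventional effective LET relation (LET divided by the cosine of the angle from normal) and a straight slant path through a die of uniform thickness. It uses the incident LET and range from Table I directly and ignores energy loss in air and overlayers and the change in LET as the ion slows, which the TAMU beam software accounts for, so the numbers are indicative only. The function names are hypothetical.

```python
import math

# (incident LET in MeV·cm²/mg, range in Si in µm), copied from Table I
BEAMS = {
    "N-14": (0.9, 1009.0),
    "Ne-22": (1.8, 799.0),
    "Ar-40": (5.5, 493.0),
    "Kr-84": (19.8, 332.0),
    "Xe-129": (38.9, 286.0),
}


def effective_let(let: float, angle_deg: float) -> float:
    """Conventional effective LET: LET / cos(angle from normal)."""
    return let / math.cos(math.radians(angle_deg))


def slant_path_um(thickness_um: float, angle_deg: float) -> float:
    """Path length through a die of the given thickness at the given angle."""
    return thickness_um / math.cos(math.radians(angle_deg))


def reaches_sensitive_volume(ion: str, thickness_um: float, angle_deg: float) -> bool:
    """Crude check (assumption): the slant path must be shorter than the ion range."""
    _, range_um = BEAMS[ion]
    return slant_path_um(thickness_um, angle_deg) < range_um


xe_ok_45 = reaches_sensitive_volume("Xe-129", 200.0, 45.0)
xe_ok_60 = reaches_sensitive_volume("Xe-129", 200.0, 60.0)
```

For a 200 µm die, the slant path at 45° is about 283 µm, which is already close to the 286 µm range of the Xe beam. This is consistent with the observation that the thicker parts could not be tested beyond 45° with the heaviest ions.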


V. DDR3 Laser and Heavy-Ion Testing 

We conducted initial SEE testing on the DDR3s at the 
TPA laser facility at NRL using a pulsed beam with a 1.26 µm 
wavelength. For this test, the DUT was imaged with near 
infrared (NIR) light, and the laser was directed to the portion 
of the die we wished to test. Fig. 3 shows an image of a 
portion of the die with memory arrays (large rectangles) near 
the top and control logic near the bottom of the picture. The 
DUT was programmed, the laser fired, and the resulting behavior 
recorded. For some runs, we chose a region of the die and 
programmed the laser to fire at random points within this 
region. The random sampling provides a closer analog to 
heavy-ion testing and allows determination of the proportion 
of circuitry that exhibits errors. The most notable result of this 
test was evident at the test site — we were unable to induce 
upsets in the memory array portion of the circuit, even using 
high laser intensities. It is unclear whether this is attributable 
to the actual immunity of the memory cells, or whether the 
thickness of the die and possible small metal obstructions 
preclude placement of the laser beam spot in the sensitive 
volume of the DRAM cell. We did observe burst errors, 
block errors and SEFIs. In most cases, a hard reset 
(resynchronizing clock plus reinitializing memory control 
logic) was required for recovery, although some errors 
required a power cycle, and a few recovered after a soft reset 
(reinitializing the chip with no resynchronization of the clock). 

Heavy-ion testing at TAMU was carried out as for the DDR2s, 
although we made some test modifications based on our 
experience testing the DDR2s (e.g., testing with light ions first 
to minimize stuck bits due to multiple ion strikes). We 
performed the test in dynamic mode using a Counter pattern, 
which had yielded the most information during DDR2 
post-processing. Again, most runs ended in large block errors or 
SEFIs that required a hard reset for recovery. We observed 
some persistent SEFIs, where a power cycle was required for 
recovery. For ions that caused stuck bits, a rough tally was 
kept to monitor the health and performance of the DUT. 
Based on the experience with stuck bits in the DDR2, we 
began testing with Ne ions in order to better determine the 
LET onset for these errors. Since no errors were seen for Ne 
at normal incidence, we did not test with N ions. The DDR3s 
were significantly harder to all SEE than the DDR2s. Where 
possible (e.g., for Ne, Ar and Kr), runs were taken with ions 
incident obliquely on the die both for tilt and roll angles. We 
gathered over 230 data runs for the DDR3s. 







Fig. 3: Infrared micrograph of memory arrays (large rectangles near the top of 
the image) and control logic (smaller features near the bottom) of the DDR3 
SDRAM tested at the NRL laser laboratory. 
VI. Data Analysis 

As expected for an SDRAM SEE test, data analysis was a 
complicated task. Although neither DDR2s nor DDR3s 
exhibited destructive failures, they did exhibit a full range of 
nondestructive SEE: Single-Event Upset (SEU), Multi-Bit 
Upset (MBU), block/burst errors, stuck bits and many SEFI 
modes. Even the DDR3s, which did not upset during laser 
testing, exhibited single-bit errors that are most easily 
understood as SEUs down to LET ~3 MeV·cm²/mg. 

These error modes had to be identified and isolated from 
each other to avoid contaminating error rate estimates. To do 
this, we had to define each error category: 

• Stuck bit: a persistent single-bit error fixed to the 
uncharged state that cannot be corrected even after a power 
cycle to the memory and so persists across at least two runs. 

• SEU/MBU: a correctable single- or multiple-bit flip. 

• Block error: a series of errors in contiguous or related 
addresses (e.g., row or column error). 

• Burst error: a rapid series of errors that may or may not 
occur at related addresses (identified by temporal 
proximity; tallied with block errors for convenience). 

• SEFI: complete or partial loss of functionality in the part 
due to an upset in the control logic or device timing that 
requires a reset of the DUT or a power cycle for recovery. 
The analysis considered only the portion of each run prior to 
the occurrence of a SEFI; there could be at most one SEFI per 
data run. For this portion of each run, the stuck bits were first 
counted and removed. Given the definition of a stuck bit, this 
could only be done in post-processing. Likewise, we removed 
and tallied the block and burst errors. The remaining errors 
were tallied to determine SEU counts for each run. Tallies for 
each error mode were combined for runs carried out under 
similar conditions to minimize random errors in the cross 
section. The results for tilt and roll angles of equal magnitude 
were compared and found not to vary significantly, so these 
runs were also combined. The analysis resulted in cross 
section vs. LET curves for four different error modes for both 
DDR2s and DDR3s: SEU, block errors, SEFI and stuck bits. 
For stuck bits, the probability of multiple hits to a single bit 
was estimated based on the total fluence of ions that had been 
incident on the part. 
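
The combination of tallies across runs taken under similar conditions can be sketched as below. The cross section is the total event count divided by the total fluence, and the one-sigma uncertainty assumes Poisson counting statistics. The `Run` record, the mode labels and the example numbers are hypothetical and are not the authors' data format. The fluence is assumed to be the value delivered before the SEFI (or the end of the run) in whatever convention the facility logs.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Run:
    fluence: float  # ions/cm² delivered before the SEFI or end of run
    counts: dict = field(default_factory=dict)  # error mode -> events tallied


def combined_cross_section(runs, mode):
    """Return (sigma, sigma_err) in cm² for one error mode over several runs.

    sigma = total events / total fluence; sigma_err = sqrt(events) / fluence
    (Poisson assumption). With zero events this returns (0.0, 0.0), so an
    upper limit would have to be handled separately.
    """
    events = sum(run.counts.get(mode, 0) for run in runs)
    fluence = sum(run.fluence for run in runs)
    if fluence <= 0:
        raise ValueError("total fluence must be positive")
    return events / fluence, math.sqrt(events) / fluence


# Hypothetical example: two runs at the same ion and angle.
runs = [
    Run(1.0e5, {"SEU": 30, "block": 2}),
    Run(1.0e5, {"SEU": 70, "SEFI": 1}),
]
sigma_seu, err_seu = combined_cross_section(runs, "SEU")
```

Pooling counts before dividing, rather than averaging per-run cross sections, weights each run by its fluence and keeps short runs that ended early in a SEFI from dominating the estimate.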


VII. DDR2 RESULTS 

Figs. 4-7 show the cross section vs. LET curves for SEU, 
block errors, SEFI and stuck bits. SEUs and SEFIs were seen 
for all test ions, including N at normal incidence. The best-fit 
onset LET was 0.6 MeV·cm²/mg for SEUs, 0.9 MeV·cm²/mg 
for SEFIs, and ~1.6 MeV·cm²/mg for block errors. The 
limiting cross section for SEUs was ~20x that for block errors, 
which was in turn about 12.5x the SEFI cross section. 
Moreover, these ratios persist over most of the range where 
errors were seen. This means that most runs had fewer than 100 
SEUs before they were ended by a SEFI or large block error. 
No MBUs were observed. In Fig. 7, it is likely that most if not 
all of the stuck bits seen at low LET arise from bits that had 
been struck by Xe or Kr ions in earlier runs. As such, the fit 
represents a worst case, and possibly an overly pessimistic 
one. Most stuck bits annealed within a few hours. However, 
some were still present two weeks later when the shipping 
containers arrived back at NASA Goddard. Also, the cross 
section curves for block errors and SEFIs seem to scale as 
expected with effective LET, while that for SEUs does not. 
The performance of these parts was consistent with 
reference [6]. 
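
The text quotes best-fit onset LETs and limiting cross sections but does not state the functional form that was fit. A form commonly used for SEE cross section curves is the four-parameter Weibull, sketched below purely as an illustration. Only the SEU onset LET of 0.6 MeV·cm²/mg comes from the text; the width, shape and normalized limiting cross section used in the example are hypothetical.

```python
import math


def weibull_cross_section(let, onset, sigma_sat, width, shape):
    """Four-parameter Weibull: zero at or below onset, approaching sigma_sat
    at high LET. Assumed form; not confirmed by the source."""
    if let <= onset:
        return 0.0
    return sigma_sat * (1.0 - math.exp(-(((let - onset) / width) ** shape)))


# Onset from the text; sigma_sat normalized to 1, width and shape hypothetical.
SEU_ONSET = 0.6
curve = [
    (let, weibull_cross_section(let, SEU_ONSET, 1.0, 10.0, 1.0))
    for let in (0.5, 1.0, 5.0, 20.0, 40.0)
]
```

Independently of the fit form, the quoted ratios imply that the SEU limiting cross section is roughly 20 × 12.5 = 250 times the SEFI limiting cross section.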
Fig. 4: SEU cross section vs. Effective LET for Samsung DDR2. Cross 
sections for ions incident off normal to the device (open symbols) do not 
scale as would be expected if effective LET held, especially at low LET. 



Fig. 5: Block error cross section vs. Effective LET for Samsung DDR2. 
Fig. 6: SEFI cross section vs. Effective LET for Samsung DDR2. 


To be published in the Institute of Electrical and Electronics Engineers (IEEE) Transaction on Nuclear Science (TNS) Dec. 2013 and on https://nepp.nasa.gov. 




Fig. 7: Stuck bit cross section vs. Effective LET for Samsung DDR2. See 
note in § VII regarding the multiple DDR2 hits. 

VIII. DDR3 Laser and Heavy-Ion Results 

Figs. 8-11 show SEE cross section vs. LET curves for the 
Samsung DDR3s. The first thing one notices about the SEE 
performance of the DDR3 devices is that they were 
significantly harder than their DDR2 counterparts. No errors 
of any type were seen for Ne ions at normal incidence 
(LET = 2.1 MeV·cm²/mg). Limiting cross sections are also 
roughly an order of magnitude lower. Moreover, the SEU 
cross section vs. LET curve seems to scale roughly as expected 
with effective LET, in contrast to the DDR2s above. As with 
the DDR2s, there were no MBUs. 

Many of the SEFIs seen in the DDR3s were also of a 
different character, exhibiting a shift where the observed data 
corresponded to the expected data for the next address — 
perhaps indicating errors in counters or circuit timing. 
However, probably the most notable feature of the data 
presented here has to do with stuck bits. While SEU, block 
error and SEFI behavior were all similar for both DDR3s 
tested, the stuck bit cross section for DUT1 was ~30x higher 
than that for DUT3, and DUT1 saw errors for Kr ions as well 
as Xe ions. Again, although most stuck bits annealed within a 
matter of hours, some were still present when the parts arrived 
back from the test. 
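
The address-shift signature described above, in which the observed data match the data expected at the next address, could be flagged in post-processing along the following lines. This sketch assumes a Counter pattern in which the expected word equals the address truncated to the word width, which the source does not confirm; the function name and the window parameters are hypothetical.

```python
def looks_like_address_shift(observed, start, length, shift=1, width_bits=64):
    """True if every word in [start, start + length) equals the counter value
    expected at (address + shift), i.e. the data appear shifted by `shift`."""
    mask = (1 << width_bits) - 1
    return all(
        observed.get(addr) == ((addr + shift) & mask)
        for addr in range(start, start + length)
    )


# Hypothetical readback: ten words that each hold the next address's data.
shifted = {addr: addr + 1 for addr in range(100, 110)}
normal = {addr: addr for addr in range(100, 110)}
is_shifted = looks_like_address_shift(shifted, 100, 10)
is_normal_shifted = looks_like_address_shift(normal, 100, 10)
```

A check of this kind distinguishes a shift in addressing or timing from a block of corrupted data, since in the shifted case every word is still a valid pattern value.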
Fig. 8: SEU cross section vs. Effective LET for Samsung DDR3. 

Fig. 9: Block error cross section vs. Effective LET for Samsung DDR3. 

Fig. 10: SEFI cross section vs. Effective LET for Samsung DDR3. 

IX. Discussion 

The results presented above are consistent with recent trends 
in DDR2 and DDR3 SEE performance. Neither the DDR2 nor 
the DDR3 was susceptible to destructive SEE. SEU rates 
remain manageable, and since column, row and block sizes 
scale with the memory size, the proportion of errors due to 
block errors continues to grow. 

In the absence of destructive SEE susceptibility, SEFIs are 
the SEE mode of most concern, especially when they require a 
power cycle for recovery. Both the DDR2 and DDR3 
exhibited such SEFIs, albeit at a low rate. Lack of statistics 
precludes estimating the rate for such error modes, but about 
2% of SEFIs observed over all ions required a power cycle for 
recovery for both DDR2 and DDR3. 

Stuck bits continue to be problematic for both testing and 
for operation in space radiation environments. For the DDR2, 
we saw stuck bits even down to the lowest test LETs. 
However, the low-LET runs were carried out with parts that 
had already received significant fluences of Kr and Xe ions 
(~2 × 10⁷ ions/cm²). Thus, the low-LET stuck bits could be 
caused by multiple ions (heavy and light) striking the same bit. 
The cumulative fluences are sufficiently high that this 
interpretation makes sense, and it also explains the difference 
between stuck bit results shown here and those of reference 
[6], where the onset LET for stuck bits was 22 MeV·cm²/mg. 
The stuck bit results for the DDR3 devices were also 
notable, mainly for the discrepancy between the susceptibility 
of the two DUTs. Prior irradiation history cannot explain the 
difference, and there appears to be a significant part-to-part 
variation. More thorough understanding of this variation is 
warranted. 
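
The plausibility of the multiple-strike interpretation can be checked with a simple Poisson estimate. In the sketch below, the mean number of hits per cell is the fluence times the sensitive area of a cell, and the expected number of cells struck at least twice is the cell count times the Poisson tail probability. The fluence (~2 × 10⁷ ions/cm²) and the 1-Gb die size come from the text; the 0.05 µm² sensitive area per cell is a hypothetical value chosen only for illustration, and the function names are likewise assumptions.

```python
import math


def poisson_tail(mean, k_min):
    """P(K >= k_min) for K ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(k_min))
    return 1.0 - below


def expected_multi_hit_cells(fluence_cm2, cell_area_um2, n_cells, k_min=2):
    """Expected number of cells struck at least k_min times."""
    mean_hits = fluence_cm2 * cell_area_um2 * 1.0e-8  # 1 µm² = 1e-8 cm²
    return n_cells * poisson_tail(mean_hits, k_min)


# Fluence and die size from the text; cell area is a hypothetical assumption.
cells_hit_twice = expected_multi_hit_cells(2.0e7, 0.05, 2**30)
```

Under these assumptions the mean is 0.01 hits per cell, and the estimate comes to several tens of thousands of doubly struck cells in a 1-Gb die. Being struck twice does not by itself make a bit stuck, but the count shows that such coincidences are far from rare at these fluences.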

Finally, the fact that SEUs were not observed during laser 
testing coupled with the fact that the SEU cross section scales 
as expected with effective LET for DDR3s, but does not scale 
with effective LET for DDR2s, suggests different mechanisms 
for the single-bit errors in the DDR2s and DDR3s. 

The sheer volume of data gathered for both DDR2 and 
DDR3 devices attests to the reliability and performance of the 
evaluation boards as SEE testers, once suitable modifications 
had been made to their IP. As expected, no SEL or other 
high-current anomalies were seen, so the lack of ability to control 
power directly to the DUT posed no obstacles to gathering 
data. The strategy proved especially useful for testing DDR3s 
at speed (666.5 MHz input frequency and 1333 MHz data rate) 
without spending significant resources to design and build a 
tester capable of such data rates. This suggests that the 
technique could be very valuable as a first look when a new 
DDR generation or speed becomes available. The use of 
DIMMs as test parts has also proven feasible. Although initial 
attempts to thin a single FBGA on the DIMMs resulted in low 
yields due to the fragility of the DRAM die, reduced pressure 
and higher bit speed resulted in improved yield for the DDR3 
DIMMs. Again, due to their easier availability, the use of 
DIMMs may be well-suited to a first look to compare SEE 
susceptibilities across multiple candidate parts, especially 
since an evaluation board should be able to accommodate any 
DIMM of the same specification, regardless of the vendor of 
the FBGAs on the DIMM. 

X. Conclusions and Recommendations 

We carried out SEE testing of DDR2 and DDR3 SDRAMs 
using DIMMs as test parts and commercial FPGA-based 
evaluation boards as SEE testers. Although the IP for the 
evaluation boards required significant modification, the 
resulting testers performed reliably throughout the test 
campaigns, allowing us to amass large SEE datasets for both 
the DDR2 and DDR3 SDRAMs. The resulting data showed 
that both memories were immune to destructive SEE-induced 
failures. In addition, Samsung DDR3 SDRAMs seem to be 
harder to single-event effects than their DDR2 counterparts, 
both in terms of onset LET and limiting cross section for SEUs, 
block errors and SEFIs. The nature of SEUs in the DDR3s seems 
to be quite different from those in DDR2 devices. Stuck-bit 
susceptibility continues to be a wild card in SDRAMs and 
deserves further investigation, both to better determine onset 
LET for the DDR2 and to better understand the part-to-part 
variation in stuck-bit susceptibility in DDR3s. We anticipate 
that the evaluation board/DIMM strategy will prove capable of 
carrying out such studies, and hope that it will also be helpful 
for investigating SEE susceptibility in future generations of 
DDR SDRAM technologies. 